69 research outputs found

    P2CS: a database of prokaryotic two-component systems

    P2CS (http://www.p2cs.org) is a specialized database for prokaryotic two-component systems (TCSs), virtually ubiquitous signalling proteins which regulate a wide range of physiological processes. The primary aim of the database is to annotate and classify TCS proteins from completely sequenced prokaryotic genomes and metagenomes. Information within P2CS can be accessed through a variety of routes—TCS complements can be browsed by metagenome, replicon or sequence cluster (and these genesets are available for download by users). Alternatively a variety of database-wide or taxon-specific searches are supported. Each TCS protein is fully annotated with sequence-feature information including replicon context, while properties of the predicted proteins can be queried against several external prediction servers to suggest homologues, interaction networks, sub-cellular localization and domain complements. Another unique feature of P2CS is the analysis of ORFeomes to identify TCS genes missed during genome annotation. Recent innovations for P2CS include a CGView representation of the distribution of TCS genes around a replicon, categorization of TCS genes based on gene organization, an expanded domain-based classification scheme, a P2CS ‘gene cart’ and categorization on the basis of sequence clusters

    P2RP: a Web-based framework for the identification and analysis of regulatory proteins in prokaryotic genomes

    BACKGROUND: Regulatory proteins (RPs) such as transcription factors (TFs) and two-component system (TCS) proteins control how prokaryotic cells respond to changes in their external and/or internal state. Identification and annotation of TFs and TCSs is non-trivial, and between-genome comparisons are often confounded by different standards in annotation. There is a need for user-friendly, fast and convenient tools to allow researchers to overcome the inherent variability in annotation between genome sequences. RESULTS: We have developed the web-server P2RP (Predicted Prokaryotic Regulatory Proteins), which enables users to identify and annotate TFs and TCS proteins within their sequences of interest. Users can input amino acid or genomic DNA sequences, and predicted proteins therein are scanned for the possession of DNA-binding domains and/or TCS domains. RPs identified in this manner are categorised into families, unambiguously annotated, and a detailed description of their features generated, using an integrated software pipeline. P2RP results can then be outputted in user-specified formats. CONCLUSION: Biologists have an increasing need for fast and intuitively usable tools, which is why P2RP has been developed as an interactive system. As well as assisting experimental biologists to interrogate novel sequence data, it is hoped that P2RP will be built into genome annotation pipelines and re-annotation processes, to increase the consistency of RP annotation in public genomic sequences. P2RP is the first publicly available tool for predicting and analysing RP proteins in users’ sequences. The server is freely available and can be accessed along with documentation at http://www.p2rp.org

    A configuration space of homologous proteins conserving mutual information and allowing a phylogeny inference based on pair-wise Z-score probabilities

    BACKGROUND: Popular methods to reconstruct molecular phylogenies are based on multiple sequence alignments, in which addition or removal of data may change the resulting tree topology. We have sought a representation of homologous proteins that would conserve the information of pair-wise sequence alignments, respect probabilistic properties of Z-scores (Monte Carlo methods applied to pair-wise comparisons) and be the basis for a novel method of consistent and stable phylogenetic reconstruction. RESULTS: We have built up a spatial representation of protein sequences using concepts from particle physics (configuration space) and respecting a frame of constraints deduced from pair-wise alignment score properties in information theory. The obtained configuration space of homologous proteins (CSHP) allows the representation of real and shuffled sequences, and thereupon an expression of the TULIP theorem for Z-score probabilities. Based on the CSHP, we propose a phylogeny reconstruction using Z-scores. Deduced trees, called TULIP trees, are consistent with multiple-alignment based trees. Furthermore, the TULIP tree reconstruction method provides a solution for some previously reported incongruent results, such as the apicomplexan enolase phylogeny. CONCLUSION: The CSHP is a unified model that conserves mutual information between proteins in the way physical models conserve energy. Applications include the reconstruction of evolutionary consistent and robust trees, the topology of which is based on a spatial representation that is not reordered after addition or removal of sequences. The CSHP and its assigned phylogenetic topology, provide a powerful and easily updated representation for massive pair-wise genome comparisons based on Z-score computations
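
The abstract above describes Z-scores as Monte Carlo methods applied to pair-wise comparisons. A minimal sketch of that general idea follows, assuming a toy Smith-Waterman scoring scheme (match, mismatch and gap values chosen only for illustration) and a plain shuffle of one sequence. The function names, parameters and scoring values are hypothetical and are not taken from the paper, which does not give these details here.

```python
import random
import statistics


def local_alignment_score(a, b, match=2, mismatch=-1, gap=-2):
    """Best Smith-Waterman local alignment score (toy scoring, linear gaps)."""
    prev = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best


def z_score(a, b, n_shuffles=200, seed=0):
    """Z = (real score - mean of shuffled scores) / std of shuffled scores.

    Sequence b is shuffled n_shuffles times, which keeps its length and
    composition but destroys its order (an assumed null model).
    """
    rng = random.Random(seed)
    real = local_alignment_score(a, b)
    letters = list(b)
    shuffled_scores = []
    for _ in range(n_shuffles):
        rng.shuffle(letters)
        shuffled_scores.append(local_alignment_score(a, "".join(letters)))
    mean = statistics.mean(shuffled_scores)
    sd = statistics.pstdev(shuffled_scores)
    if sd == 0:
        # Degenerate case: every shuffle scored the same.
        return float("inf") if real > mean else 0.0
    return (real - mean) / sd


if __name__ == "__main__":
    seq = "MKTAYIAKQRQISFVKSHFSRQLEERLGLIEVQ"
    print(z_score(seq, seq))
```

A high Z-score indicates that the real alignment scores well above what sequence composition alone would give. A production implementation would normally use a substitution matrix and affine gap penalties rather than the fixed values assumed here.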

    P2TF: a comprehensive resource for analysis of prokaryotic transcription factors

    BACKGROUND: Transcription factors (TFs) are DNA-binding proteins that regulate gene expression by activating or repressing transcription. Some have housekeeping roles, while others regulate the expression of specific genes in response to environmental change. The majority of TFs are multi-domain proteins, and they can be divided into families according to their domain organisation. There is a need for user-friendly, rigorous and consistent databases to allow researchers to overcome the inherent variability in annotation between genome sequences. DESCRIPTION: P2TF (Predicted Prokaryotic Transcription Factors) is an integrated and comprehensive database relating to transcription factor proteins. The current version of the database contains 372,877 TFs from 1,987 completely sequenced prokaryotic genomes and 43 metagenomes. The database provides annotation, classification and visualisation of TF genes and their genetic context, providing researchers with a one-stop shop in which to investigate TFs. The P2TF database analyses TFs in both predicted proteomes and reconstituted ORFeomes, recovering approximately 3% more TF proteins than just screening predicted proteomes. Users are able to search the database with sequence or domain architecture queries, and resulting hits can be aligned to investigate evolutionary relationships and conservation of residues. To increase utility, all searches can be filtered by taxonomy, TF genes can be added to the P2TF cart, and gene lists can be exported for external analysis in a variety of formats. CONCLUSIONS: P2TF is an open resource for biologists, allowing exploration of all TFs within prokaryotic genomes and metagenomes. The database enables a variety of analyses, and results are presented for user exploration as an interactive web interface, which provides different ways to access and download the data. The database is freely available at http://www.p2tf.org/
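
The abstract above mentions screening reconstituted ORFeomes as well as predicted proteomes. As a rough illustration of what building an ORFeome can involve, the sketch below scans all six reading frames of a DNA sequence for open reading frames. It assumes ATG as the only start codon and the three standard stop codons, which is a simplification (prokaryotes also use alternative starts such as GTG and TTG). The function names and the `min_codons` threshold are hypothetical, and this is not the P2TF pipeline.

```python
STOPS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Return the reverse complement of an upper-case DNA string."""
    return dna.translate(COMPLEMENT)[::-1]


def orfs_in_strand(dna, min_codons):
    """Collect ATG...stop ORFs in the three forward frames of one strand."""
    found = []
    for frame in range(3):
        start = None
        for i in range(frame, len(dna) - 2, 3):
            codon = dna[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOPS:
                # Count coding codons only (the stop codon is excluded).
                if (i - start) // 3 >= min_codons:
                    found.append(dna[start:i + 3])
                start = None
    return found


def six_frame_orfs(dna, min_codons=2):
    """Scan both strands (six frames) and return ORF nucleotide sequences."""
    dna = dna.upper()
    forward = orfs_in_strand(dna, min_codons)
    reverse = orfs_in_strand(reverse_complement(dna), min_codons)
    return forward + reverse


if __name__ == "__main__":
    print(six_frame_orfs("CCATGAAATTTTAGCC", min_codons=3))
```

Each ORF recovered this way could then be translated and searched for the domains of interest, so that genes missed by the original genome annotation still get screened.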

    P2CS: Updates of the prokaryotic two-component systems database

    The P2CS database (http://www.p2cs.org/) is a comprehensive resource for the analysis of Prokaryotic Two-Component Systems (TCSs). TCSs are composed of a receptor histidine kinase (HK) and a partner response regulator (RR) and control important prokaryotic behaviors. The latest incarnation of P2CS includes 164 651 TCS proteins, from 2758 sequenced prokaryotic genomes. Several important new features have been added to P2CS since it was last described. Users can search P2CS via BLAST, adding hits to their cart, and homologous proteins can be aligned using MUSCLE and viewed using Jalview within P2CS. P2CS also provides phylogenetic trees based on the conserved signaling domains of the RRs and HKs from entire genomes. HK and RR trees are annotated with gene organization and domain architecture, providing insights into the evolutionary origin of the contemporary gene set. The majority of TCSs are encoded by adjacent HK and RR genes; however, 'orphan' unpaired TCS genes are also abundant and identifying their partner proteins is challenging. P2CS now provides paired HK and RR trees with proteins from the same genetic locus indicated. This allows the appraisal of evolutionary relationships across entire TCSs and in some cases the identification of candidate partners for orphan TCS proteins

    P2CS: a two-component system resource for prokaryotic signal transduction research

    BACKGROUND: With the escalation of high throughput prokaryotic genome sequencing, there is an ever-increasing need for databases that characterise, catalogue and present data relating to particular gene sets and genomes/metagenomes. Two-component system (TCS) signal transduction pathways are the dominant mechanisms by which micro-organisms sense and respond to external as well as internal environmental changes. These systems respond to a wide range of stimuli by triggering diverse physiological adjustments, including alterations in gene expression, enzymatic reactions, or protein-protein interactions. DESCRIPTION: We present P2CS (Prokaryotic 2-Component Systems), an integrated and comprehensive database of TCS signal transduction proteins, which contains a compilation of the TCS genes within 755 completely sequenced prokaryotic genomes and 39 metagenomes. P2CS provides detailed annotation of each TCS gene including family classification, sequence features, functional domains, as well as genomic context visualization. To bypass the generic problem of gene underestimation during genome annotation, we also constituted and searched an ORFeome, which improves the recovery of TCS proteins compared to searches on the equivalent proteomes. CONCLUSION: P2CS has been developed for computational analysis of the modular TCSs of prokaryotic genomes and metagenomes. It provides a complete overview of information on TCSs, including predicted candidate proteins and probable proteins, which need further curation/validation. The database can be browsed and queried with a user-friendly web interface at

    Modulation of Metabolism and Switching to Biofilm Prevail over Exopolysaccharide Production in the Response of Rhizobium alamii to Cadmium

    Heavy metals such as cadmium (Cd2+) affect microbial metabolic processes. Consequently, bacteria adapt by adjusting their cellular machinery. We have investigated the dose-dependent growth effects of Cd2+ on Rhizobium alamii, an exopolysaccharide (EPS)-producing bacterium that forms a biofilm on plant roots. Adsorption isotherms show that the EPS of R. alamii binds cadmium in competition with calcium. A metabonomics approach based on ion cyclotron resonance Fourier transform mass spectrometry has shown that cadmium mainly alters bacterial metabolism in pathways involving sugars, purine, phosphate, calcium signalling and cell respiration. We determined the influence of EPS on the bacterial response to cadmium, using a mutant of R. alamii impaired in EPS production (MSΔGT). Cadmium dose-dependent effects on bacterial growth were not significantly different between the R. alamii wild type (wt) and MSΔGT strains. Although cadmium did not modify the quantity of EPS isolated from R. alamii, it triggered the formation of biofilm vs planktonic cells, both by R. alamii wt and by MSΔGT. Thus, it appears that cadmium toxicity could be managed by switching to a biofilm way of life, rather than producing EPS. We conclude that modulations of the bacterial metabolism and switching to biofilms prevail in the adaptation of R. alamii to cadmium. These results are original with regard to the conventional role attributed to EPS in a biofilm matrix, and the bacterial response to cadmium

    ATM-Mediated Transcriptional and Developmental Responses to γ-rays in Arabidopsis

    ATM (Ataxia Telangiectasia Mutated) is an essential checkpoint kinase that signals DNA double-strand breaks in eukaryotes. Its depletion causes meiotic and somatic defects in Arabidopsis and progressive motor impairment accompanied by several cell deficiencies in patients with ataxia telangiectasia (AT). To obtain a comprehensive view of the ATM pathway in plants, we performed a time-course analysis of seedling responses by combining confocal laser scanning microscopy studies of root development and genome-wide expression profiling of wild-type (WT) and homozygous ATM-deficient mutants challenged with a dose of γ-rays (IR) that is sublethal for WT plants. Early morphologic defects in meristematic stem cells indicated that AtATM, an Arabidopsis homolog of the human ATM gene, is essential for maintaining the quiescent center and controlling the differentiation of initial cells after exposure to IR. Results of several microarray experiments performed with whole seedlings and roots up to 5 h post-IR were compiled in a single table, which was used to import gene information and extract gene sets. Sequence and function homology searches; import of spatio-temporal, cell cycling, and mutant-constitutive expression characteristics; and a simplified functional classification system were used to identify novel genes in all functional classes. The hundreds of radiomodulated genes identified were not a random collection, but belonged to functional pathways such as those of the cell cycle; cell death and repair; DNA replication, repair, and recombination; and transcription; translation; and signaling, indicating the strong cell reprogramming and double-strand break abrogation functions of ATM checkpoints. Accordingly, genes in all functional classes were either down or up-regulated concomitantly with downregulation of chromatin deacetylases or upregulation of acetylases and methylases, respectively. 
Determining the early transcriptional indicators of prolonged S-G2 phases that coincided with cell proliferation delay, or an anticipated subsequent auxin increase, accelerated cell differentiation or death, was used to link IR-regulated hallmark functions and tissue phenotypes after IR. The transcription burst was almost exclusively AtATM-dependent or weakly AtATR-dependent, and followed two major trends of expression in atm: (i)-loss or severe attenuation and delay, and (ii)-inverse and/or stochastic, as well as specific, enabling one to distinguish IR/ATM pathway constituents. Our data provide a large resource for studies on the interaction between plant checkpoints of the cell cycle, development, hormone response, and DNA repair functions, because IR-induced transcriptional changes partially overlap with the response to environmental stress. Putative connections of ATM to stem cell maintenance pathways after IR are also discussed

    Integration and mining of malaria molecular, functional and pharmacological data: how far are we from a chemogenomic knowledge space?

    The organization and mining of malaria genomic and post-genomic data is highly motivated by the necessity to predict and characterize new biological targets and new drugs. Biological targets are sought in a biological space designed from the genomic data from Plasmodium falciparum, but also using the millions of genomic data from other species. Drug candidates are sought in a chemical space containing the millions of small molecules stored in public and private chemolibraries. Data management should therefore be as reliable and versatile as possible. In this context, we examined five aspects of the organization and mining of malaria genomic and post-genomic data: 1) the comparison of protein sequences including compositionally atypical malaria sequences, 2) the high throughput reconstruction of molecular phylogenies, 3) the representation of biological processes, particularly metabolic pathways, 4) the versatile methods to integrate genomic data, biological representations and functional profiling obtained from X-omic experiments after drug treatments and 5) the determination and prediction of protein structures and their molecular docking with drug candidate structures. Progress toward a grid-enabled chemogenomic knowledge space is discussed. Comment: 43 pages, 4 figures, to appear in Malaria Journal